
    A Self-Organizing Algorithm for Modeling Protein Loops

    Protein loops, the flexible short segments connecting two stable secondary structural units in proteins, play a critical role in protein structure and function. Constructing chemically sensible conformations of protein loops that seamlessly bridge the gap between the anchor points without introducing any steric collisions remains an open challenge. A variety of algorithms have been developed to tackle the loop closure problem, ranging from inverse kinematics to knowledge-based approaches that utilize pre-existing fragments extracted from known protein structures. However, many of these approaches focus on the generation of conformations that mainly satisfy the fixed end-point condition, leaving the steric constraints to be resolved in subsequent post-processing steps. In the present work, we describe a simple solution that simultaneously satisfies not only the end-point and steric conditions, but also chirality and planarity constraints. Starting from random initial atomic coordinates, each individual conformation is generated independently by using a simple alternating scheme of pairwise distance adjustments of randomly chosen atoms, followed by fast geometric matching of the conformationally rigid components of the constituent amino acids. The method is conceptually simple, numerically stable and computationally efficient. Very importantly, additional constraints, such as those derived from NMR experiments, hydrogen bonds or salt bridges, can be incorporated into the algorithm in a straightforward and inexpensive way, making the method ideal for solving more complex multi-loop problems. The remarkable performance and robustness of the algorithm are demonstrated on a set of protein loops of length 4, 8, and 12 that have been used in previous studies.
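The alternating pairwise distance-adjustment step described above can be illustrated with a minimal distance-geometry sketch. All names here are illustrative and the constraint set is a toy chain of bond lengths; the authors' actual algorithm additionally handles anchor points, sterics, chirality and planarity, which are omitted.

```python
import random
import numpy as np

def adjust_pair(coords, i, j, target):
    """Move atoms i and j symmetrically along their connecting vector
    so that their distance becomes the target value."""
    delta = coords[j] - coords[i]
    dist = np.linalg.norm(delta)
    if dist < 1e-9:                      # coincident atoms: nudge apart randomly
        delta = np.random.randn(3)
        dist = np.linalg.norm(delta)
    corr = 0.5 * (dist - target) / dist  # split the correction between both atoms
    coords[i] += corr * delta
    coords[j] -= corr * delta

def self_organize(coords, constraints, sweeps=2000):
    """Alternating scheme: repeatedly pick a random constrained pair of
    atoms and adjust its distance.  Each adjustment perturbs neighboring
    constraints, but the perturbations shrink as the errors shrink, so
    the randomized relaxation converges for a feasible constraint set."""
    pairs = list(constraints.items())
    for _ in range(sweeps):
        (i, j), target = random.choice(pairs)
        adjust_pair(coords, i, j, target)
    return coords
```

Starting from random coordinates and a chain of 1.5 Å bond-length constraints, repeated random pair adjustments drive all constraint violations toward zero.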

    Amino acid "little Big Bang": Representing amino acid substitution matrices as dot products of Euclidian vectors

    Background: Sequence comparisons make use of a one-letter representation for amino acids, the necessary quantitative information being supplied by the substitution matrices. This paper deals with the problem of finding a representation that provides a comprehensive description of amino acid intrinsic properties consistent with the substitution matrices.
    Results: We present a Euclidian vector representation of the amino acids, obtained by the singular value decomposition of the substitution matrices. The substitution matrix entries correspond to the dot products of amino acid vectors. We apply this vector encoding to the study of the relative importance of various amino acid physicochemical properties upon the substitution matrices. We also characterize and compare the PAM and BLOSUM series of substitution matrices.
    Conclusions: This vector encoding introduces a Euclidian metric in the amino acid space, consistent with the substitution matrices. Such a numerical description of the amino acids is useful when intrinsic properties of amino acids are needed, for instance when building sequence profiles or finding consensus sequences with machine learning algorithms such as support vector machines and neural networks.
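The core construction, recovering per-amino-acid vectors whose pairwise dot products reproduce the substitution matrix entries, can be sketched with the eigendecomposition of the symmetric matrix (equivalent to its SVD up to signs). The function name is illustrative, and the clipping of negative eigenvalues is an assumption: the reconstruction is exact only when the matrix is positive semi-definite.

```python
import numpy as np

def amino_acid_vectors(S):
    """Factor a symmetric substitution matrix S into per-amino-acid
    vectors (one row per amino acid) whose pairwise dot products
    reproduce the entries of S.  Negative eigenvalues are clipped, so
    the reconstruction is exact only for positive semi-definite S."""
    w, V = np.linalg.eigh(S)          # eigendecomposition of symmetric S
    w = np.clip(w, 0.0, None)         # drop any negative spectrum
    return V * np.sqrt(w)             # scale each eigenvector column
```

For a positive semi-definite matrix, `amino_acid_vectors(S) @ amino_acid_vectors(S).T` recovers `S` exactly, which is the "substitution matrix entries as dot products" property the abstract describes.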

    HiTSEE KNIME: a visualization tool for hit selection and analysis in high-throughput screening experiments for the KNIME platform

    We present HiTSEE (High-Throughput Screening Exploration Environment), a visualization tool for the analysis of large chemical screens used to examine biochemical processes. The tool supports the investigation of structure–activity relationships (SAR analysis) and, through a flexible interaction mechanism, the navigation of large chemical spaces. Our approach is based on projecting one or a few molecules of interest and expanding around their neighborhood, and it allows large chemical libraries to be explored without creating an all-encompassing overview of the whole library. We describe the requirements we collected during our collaboration with biologists and chemists, the design rationale behind the tool, and two case studies on different datasets. The described integration (HiTSEE KNIME) into the KNIME platform allows additional flexibility in adapting our approach to a wide range of different biochemical problems and enables other research groups to use HiTSEE.

    CheS-Mapper - Chemical Space Mapping and Visualization in 3D

    Analyzing chemical datasets is a challenging task for scientific researchers in the field of chemoinformatics. It is important, yet difficult, to understand the relationships between the structure of chemical compounds, their physico-chemical properties, and their biological or toxic effects. In this respect, visualization tools can help researchers better comprehend the underlying correlations. Our recently developed 3D molecular viewer CheS-Mapper (Chemical Space Mapper) divides large datasets into clusters of similar compounds and then arranges them in 3D space, such that their spatial proximity reflects their similarity. The user can indirectly determine similarity by selecting which features to employ in the process. The tool can use and calculate different kinds of features, such as structural fragments as well as quantitative chemical descriptors. These features can be highlighted within CheS-Mapper, which helps the chemist better understand patterns and regularities and relate the observations to established scientific knowledge. Finally, the tool can also be used to select and export specific subsets of a given dataset for further analysis.
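The "spatial proximity reflects similarity" layout can be sketched as a plain PCA projection of a compounds-by-features matrix to three dimensions. This is an illustrative stand-in under that assumption, not CheS-Mapper's actual embedding procedure.

```python
import numpy as np

def embed_3d(features):
    """Project a compounds-by-features matrix to 3D coordinates so that
    Euclidean proximity in the layout reflects feature similarity.
    Implemented as PCA via SVD of the centered matrix: the top three
    right singular vectors span the directions of greatest variance."""
    X = features - features.mean(axis=0)       # center each feature
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:3].T                        # coordinates in the top-3 subspace
```

Compounds with similar feature values land close together in the 3D layout, while dissimilar clusters stay well separated.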

    Spike pattern recognition by supervised classification in low dimensional embedding space

    © The Author(s) 2016. This article is published with open access under the terms of the Creative Commons Attribution License 4.0 (http://creativecommons.org/licenses/by/4.0/).
    Epileptiform discharges in interictal electroencephalography (EEG) form the mainstay of epilepsy diagnosis and localization of seizure onset. Visual analysis is rater-dependent and time-consuming, especially for long-term recordings, while computerized methods can review long EEG recordings efficiently. This paper presents a machine learning approach for automated detection of epileptiform discharges (spikes). The proposed method first detects spike patterns by calculating similarity to a coarse shape model of a spike waveform and then refines the results by identifying subtle differences between actual spikes and false detections. Pattern classification is performed using support vector machines in a low-dimensional space onto which the original waveforms are embedded by locality preserving projections. The automatic detection results are compared to experts' manual annotations (101 spikes) on a whole-night sleep EEG recording. The high sensitivity (97%) and the low false positive rate (0.1 min⁻¹), calculated by intra-patient cross-validation, highlight the potential of the method for automated interictal EEG assessment.
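The first stage, scoring similarity to a coarse spike shape model, can be sketched as a sliding normalized cross-correlation against a template. Names and the threshold value are illustrative assumptions, and the second stage (LPP embedding plus SVM refinement) is omitted.

```python
import numpy as np

def detect_candidates(signal, template, threshold=0.8):
    """Slide a coarse spike template over the signal and flag window
    start indices whose normalized correlation (Pearson r) with the
    template exceeds the threshold."""
    n = len(template)
    t = (template - template.mean()) / template.std()
    hits = []
    for start in range(len(signal) - n + 1):
        w = signal[start:start + n]
        sd = w.std()
        if sd == 0:                    # flat window: no shape to match
            continue
        wn = (w - w.mean()) / sd
        if np.dot(wn, t) / n >= threshold:
            hits.append(start)
    return hits
```

A spike-shaped transient embedded in an otherwise flat trace is flagged exactly at its onset, since only the aligned window correlates perfectly with the template.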

    DPRESS: Localizing estimates of predictive uncertainty

    Background: The need for a quantitative estimate of the uncertainty of prediction for QSAR models is steadily increasing, in part because such predictions are being widely distributed as tabulated values disconnected from the models used to generate them. Classical statistical theory assumes that the error in the population being modeled is independent and identically distributed (IID), but this is often not actually the case. Such inhomogeneous error (heteroskedasticity) can be addressed by providing an individualized estimate of predictive uncertainty for each particular new object u: the standard error of prediction s_u can be estimated as the non-cross-validated error s_t* for the closest object t* in the training set, adjusted for its separation d from u in the descriptor space relative to the size of the training set.
    [display formula rendered as a graphic in the source; not recoverable here]
    The predictive uncertainty factor γ_t* is obtained by distributing the internal predictive error sum of squares across objects in the training set based on the distances between them, hence the acronym: Distributed PRedictive Error Sum of Squares (DPRESS). Note that s_t* and γ_t* are characteristic of each training set compound contributing to the model of interest.
    Results: The method was applied to partial least-squares models built using 2D (molecular hologram) or 3D (molecular field) descriptors applied to mid-sized training sets (N = 75) drawn from a large (N = 304), well-characterized pool of cyclooxygenase inhibitors. The observed variation in predictive error for the external 229-compound test sets was compared with the uncertainty estimates from DPRESS. Good qualitative and quantitative agreement was seen between the distributions of predictive error observed and those predicted using DPRESS. Inclusion of the distance-dependent term was essential to obtaining good agreement between the estimated uncertainties and the observed distributions of predictive error. The uncertainty estimates derived by DPRESS were conservative even when the training set was biased, but not excessively so.
    Conclusion: DPRESS is a straightforward and powerful way to reliably estimate individual predictive uncertainties for compounds outside the training set, based on their distance to the training set and the internal predictive uncertainty associated with their nearest neighbor in that set. It represents a sample-based, a posteriori approach to defining applicability domains in terms of localized uncertainty.
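In the spirit of the abstract, a nearest-neighbour uncertainty estimate might be sketched as below. The specific combination rule (sum of squares of the neighbour's error and a distance-scaled term) is an assumption chosen for illustration; it is NOT the published DPRESS formula, which was rendered as a graphic in the source.

```python
import numpy as np

def nn_uncertainty(x_new, X_train, s_train, gamma):
    """Illustrative nearest-neighbour predictive uncertainty: take the
    training-set error s_t* of the closest compound t* and inflate it
    by a distance-dependent term weighted by gamma_t*.  The combination
    rule used here is an assumption, not the DPRESS formula."""
    d2 = ((X_train - x_new) ** 2).sum(axis=1)   # squared distances to training set
    t = int(np.argmin(d2))                      # index of nearest neighbour t*
    return np.sqrt(s_train[t] ** 2 + gamma[t] * d2[t])
```

A query sitting exactly on a training compound inherits that compound's error, and the estimate grows monotonically as the query moves away, mirroring the abstract's point that the distance-dependent term drives the localized uncertainty.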

    Structure-based classification and ontology in chemistry

    Background: Recent years have seen an explosion in the availability of data in the chemistry domain. With this information explosion, however, retrieving relevant results from the available information, and organising those results, become even harder problems. Computational processing is essential to filter and organise the available resources so as to better facilitate the work of scientists. Ontologies encode expert domain knowledge in a hierarchically organised, machine-processable format. One such ontology for the chemical domain is ChEBI. ChEBI provides a classification of chemicals based on their structural features and a role- or activity-based classification. An example of a structure-based class is 'pentacyclic compound' (compounds containing five ring structures), while an example of a role-based class is 'analgesic', since many different chemicals can act as analgesics without sharing structural features. Structure-based classification in chemistry exploits elegant regularities and symmetries in the underlying chemical domain. As yet, there has been neither a systematic analysis of the types of structural classification in use in chemistry nor a comparison to the capabilities of available technologies.
    Results: We analyze the different categories of structural classes in chemistry, presenting a list of patterns for features found in class definitions. We compare these patterns of class definition to tools which allow for automation of hierarchy construction within cheminformatics and within logic-based ontology technology, going into detail in the latter case with respect to the expressive capabilities of the Web Ontology Language and recent extensions for modelling structured objects. Finally, we discuss the relationships and interactions between cheminformatics approaches and logic-based approaches.
    Conclusion: Systems that perform intelligent reasoning tasks on chemistry data require a diverse set of underlying computational utilities, including algorithmic, statistical and logic-based tools. For the task of automatic structure-based classification of chemical entities, essential to managing the vast swathes of chemical data being brought online, systems capable of hybrid reasoning that combines several different approaches are crucial. We provide a thorough review of the available tools and methodologies, and identify areas of open research.
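The structure-based subsumption idea, where a class defined by more required structural features is subsumed by any class requiring a subset of them, can be sketched as a simple set-inclusion check. The class names and feature labels below are illustrative toys, not actual ChEBI definitions.

```python
def is_subclass(child_features, parent_features):
    """A class whose definition requires every feature the parent
    requires (and possibly more) is a subclass of the parent:
    subsumption reduces to set inclusion over required features."""
    return parent_features <= child_features

# Toy class definitions: each chemical class maps to the set of
# structural features a member must exhibit (illustrative only).
classes = {
    "cyclic compound":      {"ring"},
    "pentacyclic compound": {"ring", "five rings"},
}
```

Under these toy definitions, 'pentacyclic compound' is automatically placed under 'cyclic compound' in the hierarchy, while the reverse inclusion fails, which is the kind of automated hierarchy construction the abstract discusses.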

    Natural products in modern life science

    With a real threat to biodiversity in rain forests and in the sea, sustainable use of natural products is becoming more and more important. Basic research directed at different organisms in Nature can reveal unexpected insights into fundamental biological mechanisms, but also new pharmaceutical or biotechnological possibilities of more immediate use. Many different strategies have been used to prospect the biodiversity of Earth in the search for novel structure–activity relationships, which has resulted in important discoveries in drug development. However, we believe that the development of multidisciplinary incentives will be necessary for a future successful exploration of Nature. To this end, one way forward would be a modernization and renewal of a venerable, proven interdisciplinary science, Pharmacognosy, which represents an integrated way of studying biological systems. This is demonstrated here with an explanatory model whose different parts are illustrated by our ongoing research: anti-inflammatory natural products have been discovered based on ethnopharmacological observations; marine sponges in cold water have yielded substances with ecological impact; a combined strategy of ecology and chemistry has revealed new insights into the biodiversity of fungi; in-depth studies of cyclic peptides (cyclotides) have created new possibilities for the engineering of bioactive peptides; new strategies using phylogeny and chemography have opened possibilities for navigating chemical and biological space; and bioinformatic tools for understanding lateral gene transfer could point to potential drug targets. A multidisciplinary subject like Pharmacognosy, one of several scientific disciplines bridging biology and chemistry with medicine, is strategically positioned for studies of complex scientific questions based on observations in Nature. Furthermore, natural product research based on intriguing scientific questions in Nature can help attract young students to modern life science.

    The ABCD of data management
